but the information in each data is less and vice versa. This work aims to
investigate the effects of using very short duration PRPD data on the
accuracy of PD pattern recognition. The results show that machine
learning models such as Artificial Neural Network (ANN) and Support
Vector Machine (SVM) are robust enough that reducing the PRPD
duration from 15 seconds to 1 second causes less than a 5 % drop in the
PRPD classification accuracy. However, this is only true for the noise-free condition.
When the same PD data is overlapped with random noise, the classification
accuracy suffers a significant reduction of up to 19 %. Therefore, a longer PRPD
duration is recommended to withstand the effects of noise contamination.

1. INTRODUCTION

Insulation failure in electrical power system components can cause catastrophic damage. Therefore,
it is important to frequently monitor the insulation quality. Since PD measurement is a nondestructive test, it
is widely used for insulation condition assessment [1-3]. PD is defined as an electrical discharge that partially
bridges the insulation according to IEC 60270 [4]. Despite only partially bridging the insulation, PD will
cause eventual insulation breakdown if left undetected. If PD can be detected at an incipient stage, utility
companies can avoid expensive electrical equipment failures [5, 6]. Each insulation defect has its own unique
discharge attributes, which can be used to train machine learning models to identify the defect type based on
the measured PD pattern [7]. Such PD classifiers will greatly facilitate the insulation condition monitoring of
electrical power components in a low-cost and efficient manner.

PRPD is the most widely used representation for PD [8, 9]. In order to obtain a PRPD data 
representation, a PD detector is required to measure a continuous stream of PD pulses. Each individual PD 
pulse will be quantified into the phase angle (φ), charge magnitude (q) and the number of PD occurrence (n).
Because of this, PRPD is also known as the φ-q-n pattern [10]. PRPD can be represented as a 3-dimensional data
array, 3D figure, or 2D image with color contour.

The PD measurement duration used to generate a single PRPD representation is not standardized, and
different durations have been used by researchers in PD classification research, for example 300
seconds [11], 120 seconds [12], 60 seconds [9, 13], 50 seconds [14], and only 3 seconds [15]. This work aims
to investigate the effects of using very short PRPD duration on PD classification accuracy. Durations of 1
second to 15 seconds were chosen to test the robustness of machine learning models in recognizing the PD
source when provided with just 1 to 15 seconds of PRPD pattern.

Two groups of PD data sources were used for this work. The first group consists of 3 lab-fabricated
insulation materials, which provide a more consistent PRPD pattern, while the second group consists of 5
PRPD patterns measured from cross-linked polyethylene (XLPE) cable joint defects, which provide a more
inconsistent PRPD pattern. Comparing the results of both groups will give a more comprehensive view of the
effect of reducing the PRPD duration.

The PRPD duration directly correlates to the number of PD occurring per measurement. When 
features extracted from PRPD pattern were used for classification, the accuracy depends on a variety of 
factors. Since this work focuses on examining the effects of shorter PRPD duration, the other factors are
kept constant. In other words, the type of feature extraction performed, the number of training and test data,
as well as the classifier hyperparameters remain the same while only the PRPD duration is varied to observe
its effects on PD pattern recognition accuracy.

The remainder of the paper is organized as follows: Section 2 describes the overall experiment
setup, which covers the PD measurement setup, PD source preparation, and random noise data used. Section
3 describes the PD classification procedure, which includes feature extraction and the PD classifier. The results
and discussion are presented in Section 4, while Section 5 provides the conclusion for this work.


2. RESEARCH METHOD 
2.1. PD measurement setup 

A commercial PD detector, which complies with the IEC 60270 standard, was used in this work. 
The PD detector is able to display the PRPD pattern in real time and the data can be exported to a PC for 
further processing. A block diagram of the measurement setup is shown in Figure 1. 

The HV source is a step-up transformer capable of supplying up to 200 kV. The measuring capacitor 
measures the voltage supplied. The coupling capacitor will transfer an apparent charge to the test object to 
stabilize the voltage whenever it detects a voltage drop due to PD occurrence. This data is passed to the PD 
detector and the USB controller handles the data transfer between the PD detector and the PC. 

Figure 1. Block diagram of PD measurement setup


2.2. PD source preparation 

Two groups of PD sources were prepared and compared in this work. PD Group 1 consists of 3 
classes of PD which are void, corona and surface discharge measured from lab fabricated low-density 
polyethylene (LDPE). The details of the sample preparation and measurement condition can be found in [16]. 
PD Group 2 consists of 5 classes of PD source measured from XLPE cable joints with artificial defects. The 
artificial defects include incision on insulation layer, metallic particles on insulation layer, rough edges at 
semiconductor layer, air gap at semiconductor layer and off-axis joint installation. More information about 
the sample preparation can be found in [17]. 

PD Group 1 consists of 66 PD data, with 22 data for each of the 3 classes. PD Group 2 consists of
100 PD data, with 20 data for each of the 5 classes. Figure 2 shows one example of void PRPD from PD
Group 1 while Figure 3 shows one example of incision defect from PD Group 2, each at 1-second and 15-seconds
duration. The x-axis represents the phase angle of the PD occurrence, the y-axis represents the charge
magnitude of the PD, and the number of PD, also known as the PD intensity, is represented by the color
gamut at the sidebar.

Figure 2. Void PRPD from PD Group 1, (a) 1-second (b) 15-seconds

The 15-seconds duration PRPD is made up of a continuous combination of 15 units of 1-second
PRPD. The rationale for using these two groups of PD source is to better observe the impact of using shorter
PRPD duration.


PD Group 1 has a more consistent PD pattern and just 3 different classes. Conversely, PD Group 2 
has a more inconsistent PD pattern and 5 different classes. With the same PRPD duration, it is expected that 
PD Group 1 will be easier to classify and hence SVM and ANN can achieve higher classification accuracy. 
When reducing the PRPD duration for PD Group 1, it can be seen that although the PD intensity is different 
for 1 second and 15 seconds, the general shape is similar. Since the opposite is true for Group 2, this will 
make it more challenging to be classified when the PRPD duration is reduced. 
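
As a minimal sketch of the φ-q-n representation and of how a 15-second PRPD can be accumulated from 1-second units, the following Python snippet bins PD pulses into a phase-charge count grid and sums consecutive grids. The bin counts, the charge range, the pulse layout `(phase_deg, charge_pc)`, and the function names are illustrative assumptions, not the export format of the PD detector used in this work.

```python
def prpd_grid(pulses, phase_bins=36, charge_bins=20, charge_max=1000.0):
    """Count PD pulses into a phase (rows) x charge (columns) grid.

    pulses: iterable of (phase_deg, charge_pc) tuples (assumed layout).
    charge_max: assumed upper limit of the charge axis, in pC.
    """
    grid = [[0] * charge_bins for _ in range(phase_bins)]
    for phase_deg, charge_pc in pulses:
        # Wrap the phase into [0, 360) and clamp against rounding at the edge.
        row = min(int((phase_deg % 360.0) / 360.0 * phase_bins), phase_bins - 1)
        # Use the charge magnitude; values above the range go to the last bin.
        col = min(int(abs(charge_pc) / charge_max * charge_bins), charge_bins - 1)
        grid[row][col] += 1
    return grid


def add_grids(grids):
    """Sum grids cell by cell, e.g. 15 one-second grids into one 15-second grid."""
    total = [[0] * len(grids[0][0]) for _ in grids[0]]
    for grid in grids:
        for r, row in enumerate(grid):
            for c, count in enumerate(row):
                total[r][c] += count
    return total


# Toy data: the same two pulses recorded in each of 15 one-second windows.
one_second = [(45.0, 120.0), (225.0, 80.0)]
grid_1s = prpd_grid(one_second)
grid_15s = add_grids([prpd_grid(one_second) for _ in range(15)])
```

In this toy case the 15-second grid has the same occupied cells as the 1-second grid, only with higher counts, which mirrors the "different intensity, similar shape" behaviour described for PD Group 1.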


Figure 3. Incision defect from PD Group 2, (a) 1-second (b) 15-seconds





In order to investigate the robustness of the PD classifier under noise contamination, random noise
measured from ground interference was used to overlap the clean PRPD pattern. For example, a PRPD of
duration T is overlapped with random noise of the same duration T to generate noise-contaminated PRPD data.
The PD classifier is trained using clean PRPD data but tested against contaminated PRPD data. An example
of random noise PRPD is shown in Figure 4.

Figure 4. Random noise for 15-seconds
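
The overlap step can be sketched as a cell-by-cell sum of a clean PRPD count grid and a noise count grid of the same duration T. This is an assumption about how the overlap is carried out, since the text only states that the two patterns are overlapped; the function name `overlap_noise` and the toy grids are hypothetical.

```python
def overlap_noise(clean_grid, noise_grid):
    """Overlay a noise PRPD on a clean PRPD of the same duration.

    Both inputs are phase x charge count grids with identical shapes;
    counts in matching cells are added (assumed overlap rule).
    """
    if len(clean_grid) != len(noise_grid) or any(
        len(c) != len(n) for c, n in zip(clean_grid, noise_grid)
    ):
        raise ValueError("clean and noise grids must have the same shape")
    return [
        [c + n for c, n in zip(clean_row, noise_row)]
        for clean_row, noise_row in zip(clean_grid, noise_grid)
    ]


# Toy 2 x 3 grids standing in for real PRPD data.
clean = [[0, 3, 1], [2, 0, 0]]
noise = [[1, 0, 0], [0, 0, 4]]
contaminated = overlap_noise(clean, noise)
```

The function returns a new grid and leaves the clean grid untouched, so the same clean data can still be used for training.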


3. PD CLASSIFICATION 
3.1. Feature extraction 

The raw PD data and PRPD are too large to be used directly as input features to train the PD classifier. Hence,
feature extraction is required to obtain a useful representation of the PRPD pattern. The extracted features are
also known as the “PD fingerprint”. The PRPD can be sorted into two primary distributions, Hn(φ) and Hqn(φ).
Hn(φ) is a 2-D plot of PD intensity vs phase occurrence while Hqn(φ) is a 2-D plot of PD charge magnitude
vs phase occurrence. Each of these two distributions can then be divided into two separate distributions
based on the positive and negative half of the phase cycle. Four statistical features (mean, variance,
kurtosis, and skewness) can be calculated from each of the four distributions to form a total of 16 features for each
PRPD data. The kurtosis and skewness can be calculated by using the following formulas:


$$\text{Kurtosis} = \frac{\sum_{i=1}^{N} (x_i - \mu)^4 f(x_i)}{\sigma^4 \sum_{i=1}^{N} f(x_i)} - 3 \tag{1}$$

$$\text{Skewness} = \frac{\sum_{i=1}^{N} (x_i - \mu)^3 f(x_i)}{\sigma^3 \sum_{i=1}^{N} f(x_i)} \tag{2}$$


where N is the total data size, f(x_i) is the function of interest, x_i is the individual discrete value of the
distribution, and μ and σ are the mean and standard deviation of the distribution. A complete mathematical
description of Kurtosis and Skewness can be found in [18-20].
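
The feature extraction described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than the exact procedure used in this work: x_i is taken as the phase-window index, the mean and variance are computed as weighted moments in the same way as equations (1) and (2), and the names `moments` and `prpd_features` and the toy distributions are hypothetical.

```python
def moments(f):
    """Mean, variance, skewness and kurtosis of a discrete distribution.

    f[i] is the distribution value f(x_i) at position x_i = i (assumed).
    Skewness and kurtosis are set to 0 when the variance is 0.
    """
    total = sum(f)
    if total == 0:
        return (0.0, 0.0, 0.0, 0.0)
    mu = sum(i * v for i, v in enumerate(f)) / total
    var = sum((i - mu) ** 2 * v for i, v in enumerate(f)) / total
    if var == 0:
        return (mu, 0.0, 0.0, 0.0)
    sigma = var ** 0.5
    skew = sum((i - mu) ** 3 * v for i, v in enumerate(f)) / (sigma ** 3 * total)
    kurt = sum((i - mu) ** 4 * v for i, v in enumerate(f)) / (sigma ** 4 * total) - 3.0
    return (mu, var, skew, kurt)


def prpd_features(hn, hqn):
    """Build the 16-element feature vector from Hn(phi) and Hqn(phi).

    hn, hqn: lists over equally spaced phase windows covering 0-360 degrees;
    the first half is the positive half cycle, the second half the negative.
    """
    half = len(hn) // 2
    features = []
    for dist in (hn[:half], hn[half:], hqn[:half], hqn[half:]):
        features.extend(moments(dist))
    return features


# Toy distributions over 8 phase windows (4 per half cycle).
hn = [1, 2, 2, 1, 0, 4, 0, 0]
hqn = [10.0, 20.0, 20.0, 10.0, 0.0, 5.0, 0.0, 0.0]
features = prpd_features(hn, hqn)
```

Each of the four half-cycle distributions contributes four values, which gives the 16 features per PRPD data mentioned in the text.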


3.2. PD classifier 

Two commonly used machine learning classifiers, ANN [21-23] and SVM [24-27], were used as the PD
classifiers in this work. Usually, the total input data is divided into training and testing data; the
classifier is trained using the training data and tested using the testing data. For this work, the
performance of the PD classifier was evaluated using K-fold cross-validation. The input data were randomly
divided into K sets; one set is used for testing while the other sets are used for
training. This process is repeated K times so that each set is used exactly once as
testing data. The average classification accuracy is then calculated. For PD Group 1, 11-fold cross-validation
was used, while 10-fold cross-validation was used for PD Group 2. These values of K were chosen so that each
fold contains the same number of data from each class (2 data per class per fold in both groups).
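
The fold construction can be sketched as below. The text only states that the data are divided randomly and that each fold holds the same number of data per class, so the per-class shuffling, the fixed seed, and the function name `stratified_folds` are assumptions made for illustration.

```python
import random


def stratified_folds(labels, k, seed=0):
    """Split sample indices into k folds with equal per-class counts.

    Requires the size of every class to be divisible by k.
    """
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        if len(indices) % k != 0:
            raise ValueError("class size must be divisible by k")
        rng.shuffle(indices)
        per_fold = len(indices) // k
        for f in range(k):
            folds[f].extend(indices[f * per_fold:(f + 1) * per_fold])
    return folds


# PD Group 1: 3 classes x 22 data, 11 folds. PD Group 2: 5 classes x 20 data, 10 folds.
group1_labels = [c for c in ("void", "corona", "surface") for _ in range(22)]
group2_labels = [c for c in range(5) for _ in range(20)]
folds_g1 = stratified_folds(group1_labels, k=11)
folds_g2 = stratified_folds(group2_labels, k=10)
```

With these settings every fold of PD Group 1 holds 6 data and every fold of PD Group 2 holds 10 data, with 2 data from each class in both cases.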

The benefit of using K-fold cross-validation is that it reduces the risk of overfitting and selection bias
in the reported accuracy. In order to observe the performance of the PD classifier when using
noise-contaminated data, the classifier was trained using clean input data, and the test data was overlapped
with noise data prior to testing. This properly gauges the capability of the PD classifier to recognize
contaminated input data that was not seen during the training process.
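
The train-on-clean, test-on-contaminated protocol can be sketched as follows. A nearest-centroid classifier is used here purely as a lightweight stand-in so that the example runs with the standard library; it is not the ANN or SVM used in this work, and the toy features, the fixed folds, and all function names are hypothetical.

```python
def nearest_centroid_fit(X, y):
    """Stand-in classifier (NOT the paper's ANN or SVM): one centroid per class."""
    sums, counts = {}, {}
    for x, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(x))
        for j, value in enumerate(x):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}


def nearest_centroid_predict(model, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    return min(model, key=lambda label: sum((a - b) ** 2 for a, b in zip(model[label], x)))


def cross_validate(X_train_version, X_test_version, y, folds):
    """Average accuracy over folds: fit on one feature set, test on another."""
    accuracies = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = nearest_centroid_fit(
            [X_train_version[i] for i in train_idx], [y[i] for i in train_idx]
        )
        hits = sum(nearest_centroid_predict(model, X_test_version[i]) == y[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Toy 1-D features for two classes; the "noisy" copy shifts every sample by +0.4.
X_clean = [[0.0], [0.1], [0.2], [0.3], [1.0], [1.1], [1.2], [1.3]]
y = ["a", "a", "a", "a", "b", "b", "b", "b"]
X_noisy = [[v[0] + 0.4] for v in X_clean]
folds = [[0, 4], [1, 5], [2, 6], [3, 7]]
acc_clean = cross_validate(X_clean, X_clean, y, folds)
acc_noisy = cross_validate(X_clean, X_noisy, y, folds)
```

In both calls the model is fitted on clean features only; passing `X_noisy` as the second argument changes just the held-out samples, which is the point of the protocol.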


4. RESULTS AND DISCUSSION 

The effects of reducing PRPD duration on both PD Group 1 and PD Group 2 as well as the effects 
of noise contamination are shown in Figure 5 and Figure 6 where the x-axis represents the PRPD duration 
while the y-axis represents the average classification accuracy. 

Figure 5. ANN average classification accuracy for (a) PD Group 1 and (b) PD Group 2


Under the noise-free condition, using a shorter PRPD duration barely affects the ANN classification
accuracy of PD Group 1. The average accuracy is 94 % with a standard deviation of only 1.36 %. For PD Group
2, the classification accuracy suffers a minor reduction of 5 % when the PRPD duration is reduced from 15
seconds to 1 second. This can be explained by the relatively consistent PRPD pattern of PD Group 1, so
the shorter PRPD duration affects its classification accuracy less than that of PD Group 2.
The overall small reduction in classification accuracy shows the robustness of ANN and SVM in dealing
with shorter PRPD duration as input data.

When noise contamination is taken into consideration, the ANN classification accuracy of PD Group
1 deteriorates more severely than that of PD Group 2. Because of the low variation of the PRPD pattern in PD
Group 1, the PD classifier for PD Group 1 does not generalize well. Hence, any variation in the input data
causes a larger reduction in classification accuracy. For PD Group 2, there is an obvious trend where a longer
PRPD duration results in better classification accuracy.

A similar behavior is observed for the SVM classifier, where a shorter PRPD duration affects PD
Group 2 more severely than PD Group 1. However, the overall accuracy of SVM is lower than that of ANN.
Under noise contamination, SVM has on average 13 % and 19 % lower accuracy than ANN for PD Group 1
and PD Group 2, respectively.

Figure 6. SVM average classification accuracy for (a) PD Group 1 and (b) PD Group 2


5. CONCLUSION 

The effects of using shorter duration PRPD for PD classification have been investigated.
Based on the results obtained, it can be concluded that the PD classification accuracy of PD sources measured
from lab-fabricated insulation materials is not significantly affected by using a shorter PRPD duration.
However, this is only true for lab-fabricated materials. For more realistic and practical PD measured from
power system components such as XLPE cable joints, using a longer PRPD duration can improve the classification
accuracy of ANN and SVM. Using a longer PRPD duration also makes the PD classifier less susceptible
to classification accuracy reduction when dealing with noise contamination.